Efficient composite pattern finding from monad patterns

Authors

  • Jianjun Zhou
  • Jörg Sander
  • Guohui Lin
Abstract

Automatically identifying frequent composite patterns in DNA sequences is an important task in bioinformatics, especially when all the basic elements (or monad patterns) of a composite pattern are weak. In this paper, we compare one straightforward approach for assembling monad patterns into composite patterns with two other, rather complex approaches. Both our theoretical analysis and our empirical results show that this overlooked straightforward method can be several orders of magnitude faster. Furthermore, contrary to previous understanding, the empirical results show that the runtime superiority among the three approaches is closely related to the insignificance of the monad patterns.

Similar resources

Finding composite regulatory patterns in DNA sequences

Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus on monad patterns that correspond to relatively short contiguous strings. However, many of the actual regulatory signals are composite patterns that are groups of monad patterns that occur near eac...

New Algorithms for Finding Monad Patterns in DNA Sequences

In this paper, we present two new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l,d)k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(ntlσ), where t is the numbe...

Patterns for computational effects arising from a monad or a comonad

This paper presents equational-based logics for proving first order properties of programming languages involving effects. We propose two dual inference system patterns that can be instanciated with monads or comonads in order to be used for proving properties of different effects. The first pattern provides inference rules which can be interpreted in the Kleisli category of a monad and the coK...

An Efficient Range Partitioning Method for Finding Frequent Patterns from Huge Database

Data mining is finding increasing acceptance in science and business areas that need to analyze large amounts of data to discover trends that they could not otherwise find. Different applications may require different data mining techniques. The kinds of knowledge that could be discovered from a database are categorized into association rules mining, sequential patterns mining, classification, ...

From Tree Patterns to Generalized Tree Patterns: On Efficient Evaluation of XQuery

XQuery is the de facto standard XML query language, and it is important to have efficient query evaluation techniques available for it. A core operation in the evaluation of XQuery is the finding of matches for specified tree patterns, and there has been much work towards algorithms for finding such matches efficiently. Multiple XPath expressions can be evaluated by computing one or more tree p...

Journal:
  • International Journal of Bioinformatics Research and Applications

Volume 3, Issue 1

Pages: -

Publication date: 2007